Conversation
cherry-pick from opendatahub-io/vllm@6100f4b
Signed-off-by: Travis Johnson <tsjohnso@us.ibm.com>
njhill
approved these changes
May 8, 2024
Contributor
njhill
left a comment
Thanks @dtrifiro @tjohnson31415!
tdoublep
pushed a commit
that referenced
this pull request
Jan 20, 2025
This PR implements support for `batch size > 1` and tracks the warmup progress of multiple different `prompt-length/max-decode/batch-size` shapes.

### Contributions
- Introduce an env var and interpret `BATCH_SIZE` as a list of values (similar to `MIN_PAD_LENGTH` and `MAX_NEW_TOKENS`)
- Adapt the warmup loop to iterate over the zipped lists of **pad length**, **max new tokens**, and **batch size**
- Support a batch dimension for the input arguments (tokens, positions, masks) in the warmup algorithm
- Add batch-dimension support to the attention-mask update function (`update_mask()` in sendnn.py)
- Alter the test scripts to work with `batch size > 1`

#### The code has been tested in the following settings
- On **CPU**: `batch size = 4` and `batch size = 8` with `torch.compile(backend=inductor)`
- On **AIU**: `batch size = 1` in both **offline** and **online** mode

### Open questions (including unaddressed questions from [PR23](https://github.ibm.com/ai-foundation/vllm/pull/23))
- [x] Verify code functionality for `batch size = 4` and `batch size = 8` on **AIU**
- [ ] Ideally, the `SENDNNWorker` would check how many compiled shapes fit in AIU memory before starting to warm them all up. It is unclear how to decide this, and the implementation is missing.
- [ ] How should requests that are too long be handled? Right now they are just cut to the maximum padding length (we should probably fail the request and inform the client instead).
- [ ] Verify the output of the example prompts
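The env-var handling and zipped warmup loop described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the helper name `env_int_list`, the default values, and the loop body are assumptions; only the env var names (`BATCH_SIZE`, `MIN_PAD_LENGTH`, `MAX_NEW_TOKENS`) and the zip-over-shapes idea come from the description.

```python
import os

def env_int_list(name: str, default: str) -> list[int]:
    """Interpret an env var as a comma-separated list of ints.

    Hypothetical helper mirroring how the PR describes BATCH_SIZE,
    MIN_PAD_LENGTH, and MAX_NEW_TOKENS being parsed as lists of values.
    """
    return [int(v) for v in os.environ.get(name, default).split(",")]

pad_lengths = env_int_list("MIN_PAD_LENGTH", "64,128")
max_new_tokens = env_int_list("MAX_NEW_TOKENS", "20,20")
batch_sizes = env_int_list("BATCH_SIZE", "1,4")

# Warmup iterates over the zipped lists, compiling one shape per
# (pad length, max new tokens, batch size) tuple.
for pad_len, max_tokens, bs in zip(pad_lengths, max_new_tokens, batch_sizes):
    print(f"warming up shape: pad={pad_len} "
          f"max_new_tokens={max_tokens} batch={bs}")
```

In the real warmup, each tuple would drive a forward pass with batch-shaped tokens, positions, and masks so the compiler caches that shape before serving begins.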
Cherry-pick of fix commit 6100f4b from ODH:
opendatahub-io/vllm#17